Gradient Boosted Large Margin Nearest Neighbors
Authors
Abstract
A fundamental question of machine learning is how to compare examples. If an algorithm could perfectly determine whether two examples were semantically similar or dissimilar, most subsequent machine learning tasks would become trivial (i.e., a 1-nearest-neighbor classifier would achieve perfect results). A common choice for a dissimilarity measure is an uninformed norm, such as the Euclidean distance. While simple, it does not necessarily reflect similarity in the problem’s domain.
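The contrast between an uninformed norm and a learned metric can be made concrete with a small sketch. The Python snippet below (the toy data and the matrix `L_learned` are hypothetical illustrations, not the method of this paper) classifies a query with a 1-nearest-neighbor rule under the plain Euclidean distance and under a Mahalanobis distance d_M(x, y) = sqrt((x − y)ᵀ M (x − y)) with M = LᵀL, which is equivalent to the Euclidean distance after the linear map x ↦ Lx.

```python
import numpy as np

def nn_predict(X_train, y_train, x, L=None):
    """Return the label of the training point nearest to x.

    If L is given, distances are computed after the linear map v -> L @ v,
    i.e. under the Mahalanobis distance with M = L.T @ L.
    """
    if L is not None:
        X_train, x = X_train @ L.T, x @ L.T
    dists = np.linalg.norm(X_train - x, axis=1)
    return y_train[np.argmin(dists)]

# Toy data (hypothetical): the second feature is pure noise.
X = np.array([[0.0, 0.0],   # label 0
              [1.0, 5.0],   # label 0, but far away along the noisy feature
              [2.0, 0.0]])  # label 1
y = np.array([0, 0, 1])
query = np.array([1.2, 0.1])

L_learned = np.diag([1.0, 0.1])  # hypothetical metric that shrinks the noise axis

print(nn_predict(X, y, query))             # Euclidean 1-NN -> 1
print(nn_predict(X, y, query, L_learned))  # Mahalanobis 1-NN -> 0
```

Under the identity map the noisy second feature dominates and the query’s nearest neighbor has label 1; after down-weighting that feature, the nearest neighbor (and hence the prediction) flips to label 0.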
Similar papers
Feature Engineering in User's Music Preference Prediction
The second track of this year’s KDD Cup asked contestants to separate a user’s highly rated songs from unrated songs for a large set of Yahoo! Music listeners. We cast this task as a binary classification problem and addressed it using gradient boosted decision trees. We created a set of highly predictive features, each with a clear explanation. These features were grouped into five categor...
An Empirical Evaluation of Supervised Learning for ROC Area
We present an empirical comparison of the AUC performance of seven supervised learning methods: SVMs, neural nets, decision trees, k-nearest neighbor, bagged trees, boosted trees, and boosted stumps. Overall, boosted trees have the best average AUC performance, followed by bagged trees, neural nets and SVMs. We then present an ensemble selection method that yields even better AUC. Ensembles are...
A Gradient-Based Metric Learning Algorithm for k-NN Classifiers
Nearest Neighbor (NN) classification and regression techniques are, besides being simple, among the most widely applied and well-studied techniques for pattern recognition in machine learning. A drawback, however, is that they assume a suitable metric is available for measuring distances to the k nearest neighbors. It has been shown that k-NN classifiers with a suitable distance metr...
Mixtures of Large Margin Nearest Neighbor Classifiers
The accuracy of the k-nearest neighbor algorithm depends on the distance function used to measure similarity between instances. Methods have been proposed in the literature to learn a good distance function from a labelled training set. One such method is the large margin nearest neighbor classifier that learns a global Mahalanobis distance. We propose a mixture of such classifiers where a gati...
Nearest Neighbors with Learned Distances for Phonetic Frame Classification
Nearest neighbor-based techniques provide an approach to acoustic modeling that avoids the often lengthy and heuristic process of training traditional Gaussian mixture-based models. Here we study the problem of choosing the distance metric for a k-nearest neighbor (k-NN) phonetic frame classifier. We compare the standard Euclidean distance to two learned Mahalanobis distances, based on large-mar...
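Several of the papers above build on learned Mahalanobis distances and on the large margin nearest neighbor (LMNN) formulation of Weinberger and Saul. For reference, the sketch below restates the standard definitions; the notation is assumed from the LMNN literature, not taken from the abstracts above: c is a positive trade-off weight, j ⇝ i ranges over the target neighbors of example i, and y_il = 1 exactly when x_i and x_l share a label.

```latex
% Mahalanobis distance: Euclidean distance after the linear map x -> Lx.
\[
  d_M(x_i, x_j) \;=\; \sqrt{(x_i - x_j)^\top M \,(x_i - x_j)},
  \qquad M = L^\top L \succeq 0 .
\]
% Large-margin objective (Weinberger & Saul): pull target neighbors
% (j ~> i) close; push differently labeled points l (y_{il} = 0) at least
% one unit farther away than any target neighbor, via a hinge loss [.]_+ .
\[
  \varepsilon(L) \;=\; \sum_{j \rightsquigarrow i} \| L (x_i - x_j) \|^2
  \;+\; c \sum_{j \rightsquigarrow i} \sum_{l} (1 - y_{il})
  \Big[\, 1 + \| L (x_i - x_j) \|^2 - \| L (x_i - x_l) \|^2 \,\Big]_+ .
\]
```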